
IPOPT via cyipopt #2368

Open · wants to merge 1 commit into main

Conversation

jduerholt (Contributor)

Motivation

As discussed several times with @Balandat, for some problems (especially large ones with constraints) it could be beneficial to use IPOPT instead of scipy's SLSQP or L-BFGS-B optimizers. Of course, this has to be tested, also with different linear solvers such as HSL within IPOPT (https://coin-or.github.io/Ipopt/INSTALL.html#DOWNLOAD_HSL).

The idea behind this PR is to bring IPOPT into botorch via its Python wrapper cyipopt (https://pypi.org/project/cyipopt/).

cyipopt offers a handy interface that behaves almost exactly like scipy.optimize.minimize, so for now I just copied gen_candidates_scipy and created a method gen_candidates_ipopt. Ideally, the two would be merged into one method.

The following problems exist for a common method covering both scipy and IPOPT optimization:

  • Currently, the optimization timeout is implemented via a callback passed to scipy.optimize.minimize. cyipopt does not support callbacks (in short: the keyword is ignored, https://cyipopt.readthedocs.io/en/stable/reference.html#cyipopt.minimize_ipopt). A solution could be to implement the timeout inside the objective function, incorporate it into f_np_wrapper, and remove the current minimize_with_timeout (see the sketch after this list). What do you think?
  • It is recommended to install cyipopt via conda; if installed via pip, only the wrapper is installed and not the actual solver. cyipopt would therefore be an optional dependency of botorch that is imported at execution time. An alternative would be to allow passing a callable to the new version of gen_candidates_scipy that has the same signature as scipy.optimize.minimize and is then used for the actual optimization. This would make it even more flexible and avoid the cyipopt dependency altogether; the only remaining work would be to refactor the timeout functionality.

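A rough sketch of what I mean by moving the timeout into the objective (the exception and wrapper names here are just placeholders, not existing botorch code):

import time
import numpy as np

class OptimizationTimeoutError(Exception):
    # placeholder exception; the caller would catch it and return the best point seen so far
    def __init__(self, current_x: np.ndarray, runtime: float):
        super().__init__(f"Optimization timed out after {runtime:.1f}s")
        self.current_x = current_x
        self.runtime = runtime

def with_timeout(f, timeout_sec: float):
    """Wrap an objective so that it raises once the time budget is exhausted."""
    start = time.monotonic()

    def wrapped(x: np.ndarray) -> float:
        runtime = time.monotonic() - start
        if timeout_sec is not None and runtime > timeout_sec:
            raise OptimizationTimeoutError(current_x=x, runtime=runtime)
        return f(x)

    return wrapped
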
Additional comments and questions:

  • IPOPT can exploit Hessians both for the constraints and for the actual objective function. This is not yet implemented, but it would be interesting to try.
  • Any ideas regarding proper benchmarking of IPOPT in botorch?

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

Not yet implemented, as the current implementation is still experimental and has to be finalized.

@facebook-github-bot added the CLA Signed label on Jun 10, 2024
@jduerholt (Contributor, Author)

Regarding the timeout: if you do not want to integrate it into the objective callable, one could also just ignore it, which would mean that no timeout can be used in the IPOPT case.

Regarding Hessians, this would be the way to integrate them so that no extra function call is needed for them: https://stackoverflow.com/questions/68608105/how-to-compute-objective-gradient-and-hessian-within-one-function-and-pass-it-t
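
A rough sketch of the caching pattern from that answer (assuming a cyipopt version whose minimize_ipopt accepts jac and hess arguments; the helper class is just illustrative):

import numpy as np

class ValueGradHessCache:
    # illustrative helper: evaluate value, gradient and Hessian once per x
    # and serve them through the separate callables minimize_ipopt expects
    def __init__(self, fun_grad_hess):
        self._fun_grad_hess = fun_grad_hess  # x -> (value, gradient, hessian)
        self._x = None
        self._out = None

    def _update(self, x):
        if self._x is None or not np.array_equal(x, self._x):
            self._x = np.copy(x)
            self._out = self._fun_grad_hess(x)

    def fun(self, x):
        self._update(x)
        return self._out[0]

    def jac(self, x):
        self._update(x)
        return self._out[1]

    def hess(self, x):
        self._update(x)
        return self._out[2]

# usage would then be something like:
# minimize_ipopt(cache.fun, x0, jac=cache.jac, hess=cache.hess, ...)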

Do you have any experience with computing hessians of the acqf with respect to the inputs via autograd? Is this feasible?

Best,

Johannes

@Balandat (Contributor)

Thanks for putting this up, excited to see some other optimizers that could be useful in cases where SLSQP just ends up being too slow.

A solution could be to implement the timeout inside the objective function, incorporate it into f_np_wrapper, and remove the current minimize_with_timeout. What do you think?

Hmm, not a bad idea. I think for a v0 we can probably ignore the timeout complications for now; we can follow up to support timeouts once we are sure everything else works.

It is recommended to install cyipopt via conda; if installed via pip, only the wrapper is installed and not the actual solver. cyipopt would therefore be an optional dependency of botorch that is imported at execution time.

Yeah I think for now we can have IPOPT + cyipopt as optional dependencies, and just error out at runtime if they are not installed.
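
Something like this guarded import would do (just a sketch; the flag and error message are illustrative):

try:
    from cyipopt import minimize_ipopt

    CYIPOPT_AVAILABLE = True
except ImportError:
    CYIPOPT_AVAILABLE = False

def gen_candidates_ipopt(*args, **kwargs):
    if not CYIPOPT_AVAILABLE:
        raise ImportError(
            "gen_candidates_ipopt requires cyipopt and an IPOPT build; "
            "install them e.g. via conda install -c conda-forge cyipopt."
        )
    ...  # actual optimization via minimize_ipopt would go here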

Any ideas regarding proper benchmarking of IPOPT in botorch?

I would probably just run this with a bunch of different acquisition functions on a bunch of different models fitted to data from synthetic functions, and then compare optimization performance (e.g. in terms of % improvement or # of best found solutions) and wall times. Maybe @SebastianAment still has some code sitting around for this from the logEI paper?

IPOPT can exploit Hessians both for the constraints and for the actual objective function. This is not yet implemented, but it would be interesting to try.

Indeed!

Do you have any experience with computing hessians of the acqf with respect to the inputs via autograd? Is this feasible?

We've looked into this in the past a bit; since the computation is fully in pytorch it should be possible to use built-in torch functionality for this, see e.g. https://pytorch.org/functorch/stable/notebooks/jacobians_hessians.html#hessian-computation-with-functorch-hessian. I recall that, in the past, there were some coverage gaps with jacrev and jacfwd on some of the linear algebra functions, but maybe those have been closed by now. Should be easy to give this a shot.
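
For example, something along these lines (a minimal sketch; in recent torch versions the functorch transforms are exposed as torch.func, and a stand-in scalar function is used here instead of an acquisition function):

import torch
from torch.func import hessian

# stand-in for an acquisition value: a scalar function of a (1 x d) input
def f(x: torch.Tensor) -> torch.Tensor:
    return (x ** 2).sum()

X = torch.rand(1, 2, dtype=torch.float64)
H = hessian(f)(X)  # shape (1, 2, 1, 2): second derivatives w.r.t. X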

@jduerholt (Contributor, Author)

Yeah I think for now we can have IPOPT + cyipopt as optional dependencies, and just error out at runtime if they are not installed.

So you prefer to include the dependency directly and not rewrite gen_candidates_scipy in a way that one can pass it a callable with the same signature as scipy.optimize.minimize? I am leaning a bit towards not including it explicitly, as this would require fewer changes to the test suite. But I am totally open to both approaches.
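
For illustration, something like the following is what I have in mind (the gen_candidates signature here is just a sketch, not the actual botorch one):

from scipy.optimize import minimize

def gen_candidates(fun, x0, minimizer=minimize, **minimize_kwargs):
    # any callable with the scipy.optimize.minimize signature
    # (e.g. cyipopt.minimize_ipopt) can be passed as `minimizer`
    return minimizer(fun, x0, **minimize_kwargs)

res = gen_candidates(lambda x: (x ** 2).sum(), x0=[1.0, -2.0], method="L-BFGS-B")

# with cyipopt installed, the same call could route through IPOPT instead:
# from cyipopt import minimize_ipopt
# res = gen_candidates(lambda x: (x ** 2).sum(), x0=[1.0, -2.0], minimizer=minimize_ipopt)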

Regarding the Hessians, I tried several things, all of which lead to errors. The different approaches are based on the basic botorch example:

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch.acquisition import UpperConfidenceBound

# synthetic training data
train_X = torch.rand(10, 2, dtype=torch.float64)
Y = 1 - torch.linalg.norm(train_X - 0.5, dim=-1, keepdim=True)
Y = Y + 0.1 * torch.randn_like(Y)  # add some noise
train_Y = standardize(Y)

# fit a GP and build the acquisition function
gp = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)

UCB = UpperConfidenceBound(gp, beta=0.1)

# candidate point at which to differentiate the acquisition function
X = torch.tensor([[0.9513, 0.8491]], dtype=torch.float64, requires_grad=True)

I tried the following (I am definitely no expert in torch autograd stuff :-)):

  1. torch.autograd.functional.hessian(UCB(X), X) leads to TypeError: 'Tensor' object is not callable
  2. Both jacrev and jacfwd (on which functorch.hessian is based) crash with a RuntimeError: Bad optional access in gpytorch/means/constant_mean.py in the forward pass. Sometimes I even get a segfault. For this, I tried the following (both with jacrev and jacfwd):
from functorch import jacrev, jacfwd

jacrev(UCB)(X)

and

jacrev(lambda x: UCB(x).sum())(X)

Any ideas on this?

@Balandat (Contributor)

So you prefer to include the dependency directly and not rewrite gen_candidates_scipy in a way that one can pass it a callable with the same signature as scipy.optimize.minimize? I am leaning a bit towards not including it explicitly, as this would require fewer changes to the test suite. But I am totally open to both approaches.

Yeah, that would work too. One downside is that it would be much less discoverable that way, a sort of "hidden feature". I guess we can cross that bridge when we know that this is working well. I may not get to dig very deep into the Hessian question for a while; have you tried just running this without providing Hessians?

I know @yucenli has been around folks in the past who did compute Hessian via autograd, maybe she has some thoughts here?

@jduerholt (Contributor, Author)

Ok, I will start benchmarking it without the hessian for the GP. As soon as I have results, I will share them here ;)

@Balandat (Contributor)

@jduerholt any insights from the benchmarking?

@jduerholt (Contributor, Author)

Unfortunately not, as I have not had time to work on it further. But it is still on my list, and I hope to find a motivated student to support me on this ;)
